Educ. Sci. 2013, 3, 326-343; doi:10.3390/educsci3030326 


OPEN ACCESS 


Article 


Education Sciences

ISSN 2227-7102 

www.mdpi.com/journal/education 


“Which Child Left Behind”: Historical Issues Regarding Equity 
in Science Assessment 


Peter J. Fensham 1,* and J. Joy Cumming 2

1 Monash University and Queensland University of Technology, Kelvin Grove campus, Victoria Park 
Rd, Kelvin Grove, QLD, 4059, Australia 

2 Griffith Institute for Educational Research, Mount Gravatt campus, Griffith University,

176 Messines Ridge Rd, Mt Gravatt, QLD 4122, Australia; E-Mail: J.Cumming@griffith.edu.au 

* Author to whom correspondence should be addressed; E-Mail: p.fensham@qut.edu.au; 

Tel. +61-7-3138 3338, Fax. +61-7-3138-3985. 


Received: 22 April 2013; in revised form: 24 July 2013 / Accepted: 24 July 2013 /
Published: 7 August 2013


Abstract: Assessment of learning plays a dominant role in formal education: it determines the
features of curriculum that are emphasized, the pedagogic methods that
teachers use with their students, and parents’ and employers’ understanding of how well
students have performed. A common perception is that fair assessment applies the same 
mode of assessment and content focus for all students—the approach of assessments in 
international comparative studies of science achievement. This article examines research 
evidence demonstrating that the act of assessment is not neutral—different forms of 
assessment advantage or disadvantage groups of students on the basis of family 
backgrounds, gender, race, or disability. Assessment that implicitly or explicitly captures 
the social capital of the child serves to consolidate, not address, educational inequity.
The article provides an overview of ways that science curriculum focus and assessment can 
introduce bias in the identification of student achievement. It examines the effect of 
changes to curriculum and assessment approaches in science, and relationships between 
assessment of science and the cultural context of the student. Recommendations are 
provided for science-assessment research to address bias for different groups of students. 

1. Introduction

At the biennial Symposium of the International Organization for Science and Technology 
Education (IOSTE) in Tunisia in November 2012, a Swedish group presented a symposium that 
explored the issue of “Which Child Left Behind?”, an unexpected twist on the aspirational US rhetoric
in the Bush years about the internationally-known US No Child Left Behind Act (NCLB) of 2001 [1]. 
The question suggests that whenever we revise the emphasis of the curriculum for science, encourage 
new pedagogies, and make use of different assessment practices, some groups of students will be 
disadvantaged and some advantaged—a rather sobering thought for educational innovators. 

Using reported accountability as the driver, one intention of NCLB in the US is to ensure focus on
effective educational provision for all students regardless of racial, ethnic or social class background, 
gender or disability. The underlying philosophy is that all children can learn, and teachers should have 
expectations that all children can achieve high standards [2]. 

Educational reforms, especially those that rely on social indicators such as NCLB’s student 
achievement reporting, can have unintended outcomes [3], and considerable criticism has resulted as to 
the negative effects of NCLB on the learning of all students [4]. While the reforms of NCLB are not 
central to this discussion, they highlight that reforms in education in general, and in assessment more 
specifically, are not neutral acts in education. Change has consequence. 

Research in the field of assessment has noted for some time both positive and negative effects 
of assessment in driving student learning [5-8]. Educational assessment signifies to the learner the 
knowledge that is valued. Positive and negative effects occur regardless of whether assessment is 
occurring at a systemic level, such as by external examination or participation in international testing 
programs such as Trends in International Mathematics and Science Study (TIMSS) or the Programme 
for International Student Assessment (PISA), at a school level, or within classrooms. 

Educational assessment research today focuses not only on valid assessment of student achievement 
but also on assessment that can be used to improve student learning, and assessment for formative 
purposes [9]. Research on effective assessment to improve learning emphasizes three critical points: 
First, students need to understand educational goals underpinning educational assessment and to be 
able to self-evaluate their progress towards these; second, the mode or form through which student 
knowledge is assessed will affect what students learn and how well they are able to demonstrate their 
knowledge; and third, but not least, within classrooms and school environments, interactions among 
teachers and students are critical in effective assessment. Educational assessment at all levels is 
socially-situated practice [10,11]. As the noted assessment expert Patricia Broadfoot has observed:
“Assessment is not an exact scientific process—the involvement of human beings in every aspect of its 
design, execution and use makes it irrevocably a social project” [12]. The nature of the knowledge 
itself has been identified as socially-constructed [13,14]. 

When the complexity of assessment practice is considered, its potential to have inequitable impact 
on students is clear [10,15]. Research shows that changes in curriculum can lead to changes in 
assessment directions. A major curriculum reform in science education has been the move to a focus 
on deeper scientific learning and to modes of assessment that in themselves create different 
performance expectations, and outcomes, for students. In this article, we provide a historical overview 
of relationships between students’ learning in and engagement with science, changes in curriculum, 
and demonstration of achievement through different forms of assessment. We explore the issue of
cultural or social capital, that is, the experiences that students have in diverse family backgrounds, 
including race (cultural experience/knowledge and language), socio-economic status (SES) and 
gender. Disability, the focus of most recent assessment-equity research, can also be viewed holistically 
from a cultural or social capital framework. Cultural difference underpins language and linguistic 
differences, and experiential learning of students. These considerations can affect students’ science 
learning and demonstration of their scientific understanding, according to the identified focus of 
scientific learning and assessment mode. 

It may be that what follows provides answers to the embarrassing question posed at IOSTE. 

2. Social Capital and Science Education 

A large study of Australian doctoral students in Chemistry in the 1970s led to several unexpected 
findings. The levels of educational qualification of both their fathers and mothers were less than those 
of the whole cohort of students studying first year Science. Seventy-three per cent of fathers and 84% 
of mothers had not matriculated from high school compared with 54% and 67%, respectively, for the 
first year students. On a 1-7 point scale (1 highest), the SES status of fathers of the doctoral students 
clustered between four and six compared with one and four for the first year students [16]. 

The great majority of these doctoral students were male. They had continued directly through 
school into undergraduate studies, and then into doctoral work with little or no study of subjects 
outside of science and mathematics in their final school and undergraduate years. These data paint a 
picture of lower SES boys, first generation university students, who did well in mathematics in 
primary school and then, encouraged by their teachers, continued in this success line into secondary 
school and hence into the physical sciences at university. Unlike their higher SES peers, at the end of 
schooling, they did not consider other professional paths, like medicine, dentistry or engineering, that
their expertise in science and mathematics could also have enabled. 

The mathematics and sciences these doctoral students studied at school were the then recently 
revised curricula that stemmed from what is now known as the “ALPHABET” era due to the short 
hand titles, such as Biological Sciences Curriculum Study (BSCS) Biology, Chem Study Chemistry 
and Physical Science Study Committee (PSSC) Physics, for these large scale and lavishly funded 
projects developed on both sides of the Atlantic. These revisions brought the school curricula in 
mathematics and the sciences into line with the way these subjects were then being taught at university 
level. The content for learning became more conceptual and more mathematical at the expense of the 
descriptive detail and more historical and applied emphases in earlier curricula [17]. These new 
curricula coincided with a major shift in the mode of assessment from the shorter and longer 
descriptive essays commonly required previously in the sciences to short answer and multiple choice 
items focused on concepts. In retrospect, these changes in content emphasis meant that the abstractness 
of these subjects made them more immune to socio-cultural capital compared with subjects in the 
humanities, the arts and social sciences, and along with the low demand for written work, allowed 
working class boys, with encouragement, to succeed in them. 

This research from nearly four decades ago raises interesting issues about the more recent curricular 
changes being introduced into school science. Since the later 1980s a gradual change in the curricular 
rhetoric about school science has been to make the links between science and society more explicit.
Over this 25 year period these intentions have been supported initially by innovative curriculum 
materials generated by the Science/Technology/Society movement [18] and more recently by the use 
of context-based teaching in science classrooms, an approach that begins with a real world context 
involving science and technology [19]. 

The two large international assessment projects for science learning that began in the 1990s, Trends 
in International Mathematics and Science Study (TIMSS) and Programme for International Student
Assessment (PISA), have to varying degrees continued to encourage these links between science and 
society. The PISA project particularly emphasizes the links by contextualizing test items in 
socio-scientific situations [20]. Both projects have introduced open-ended items requiring short and 
longer prose answers and minimized mathematical content in their science items. While issues are 
raised later regarding broader cultural and linguistic biases of both testing programs, both projects 
have found that in many countries there is a quite steep SES gradient for performance [20,21]. The 
SES status is based on several variables related to family cultural background. When compounded 
with greater parental experience of education, such a background creates socio-cultural capital that 
may mean some forms of assessing science knowledge and performance advantage the students with 
“richer” socio-cultural capital compared with those who are “poorer”. 

Changes to school science curriculum and assessment could therefore be adding to socio-economic 
inequity in the learning of school science, compounded by newer modes of science assessment. The 
mainstream science curriculum in most countries still gives so much priority to the abstract conceptual 
bases of the sciences that Lyons [22] in Australia, Lindahl [23] in Sweden and Osborne and 
Collins [24], in England, found that by the end of compulsory schooling students reported that they 
had experienced school science as a subject with abstract content that had little relevance to everyday 
life. Nevertheless, the pressure to establish that relevance is now part of the strategy in the more 
developed countries to counter unpopularity of science as a field of interest or for a future career. As 
science teachers gradually adapt their pedagogical practice and their assessment processes to comply 
with these socio-scientific intentions, they may unintentionally create barriers for those students who 
bring less social capital to their classrooms. The question for science assessment is the need to clarify 
its focus—core scientific concepts and understanding; situated scientific understanding; or 
application—and to identify social capital barriers that each aspect of science learning and assessment 
may create for some students. 

3. Gender and Science Achievement 

While TIMSS data present strong evidence of gender difference in science achievement at fourth 
grade and eighth grade, these differences vary according to country and cultural background. While 
little difference exists in Australia, England and the US between boys and girls at fourth grade, by 
eighth grade boys outperform girls in Australia and the US [21]. Many different interactions between 
grade, gender and test data occur across countries. TIMSS’ analyses are not able to interpret the 
reasons for these gender differences. 

The interaction between science reform, assessment and gender, however, is well-exemplified 
in historical changes to science curriculum over the last three decades. School-level Physics is one 
example. Physics for girls has been problematic for two reasons. The first reason is participation: far
fewer girls usually undertake the subject. A pattern of girls’ choosing a wider range of subjects in 
senior schooling, both in and out of science disciplines, has persisted for some time [25]. Research 
indicates this is due to a mix of reasons including perception of use of the area of study in a future 
career, general liking or disliking of the subject, and perceived likelihood of success.

A further reason is that assessment patterns in girls’ achievements in Physics differ across 
contexts—in some contexts their performance is consistently higher; in others, their performance is 
consistently lower [25]. The case of the Victorian Certificate of Education (VCE) in Australia provides 
an example of how contextual change in curriculum emphasis and in assessment can affect gender 
achievement in Physics. Reform of the VCE in the late 1980s involved a major revision of the 
academic components of the final two years of schooling and assessments leading to certification for 
competitive selection into university and employment. Victoria’s political climate, with a strong 
feminist and environmental presence, was also conducive to significant reform. The McClintock 
Collective (a network of gender-aware teachers), and other interest groups were well represented on 
the Science Committee responsible for developments to reform Physics curriculum content, 
pedagogical approaches and assessment procedures. Hildebrand [26] has discussed the strategy of the 
McClintock Collective as being drawn from post-structural feminism, informed by research studies 
that suggest girls prefer to learn concepts in their social context rather than abstractly, and to give
structured and extended responses to questions. Girls were considered likely to benefit from variety in
assessed tasks spread throughout the course of study, rather than a single terminal examination. 

The Collective’s voice was not the only one on the Science Committee pressing for a broader sense 
of the purpose of school Physics and, as a result, a number of new aims were adopted for the Physics 
curriculum, including: 

1. becoming aware of Physics as a particular way of knowing about the world which interacts 
with the setting, both social and personal, within which it is pursued; 

2. understanding some of the practical applications of Physics in present and past technologies, 
examining the usefulness of such technologies as well as problems associated with them; 

3. developing capacity and confidence to communicate knowledge of Physics. 

Lists of aims are common for science courses and it is equally common for a number of these aims 
to then be ignored in the assessment process. This is often because they require modes of assessment 
that differ from traditional forms. In the case of VCE Physics (and other sciences at this senior level 
of schooling) these aims were, however, used to introduce two new forms of assessment—Work 
Requirements and Common Assessment Tasks (CATS). 

Work Requirements consisted of six to eight tasks undertaken throughout the last two years of 
Physics in high school. They were intended to make explicit to students what they are learning. In this 
sense they served both metacognitive and formative assessment roles. Guidelines for the tasks 
provided students with opportunities to express their learning in various ways including posters, case 
study reports, student-designed investigations, a file of changing ideas, and so on. Satisfactory 
completion of the tasks was judged by classroom teachers supported by peer moderation, also a 
powerful professional development experience for the teachers. 

CATS were common assessment tasks across schools spread throughout the second year of Physics
that provided the basis for a graded assessment of the comparative quality of each student’s learning 
work and were used for certification and university selection. Both CAT 1, an extended practical 
investigation, and CAT 3, a research project of Physics in a social context, required an extended report 
graded by the classroom teacher with peer moderation. CAT 2, comprehension and application of 
Physics, and CAT 4, explanation and modeling in Physics, completed under “test conditions”, were set 
and assessed externally. In this way there was a genuine endeavor to ensure that each of the intended aims
of the course was addressed, and hence valued, in assessment.

As a result, participation in and completion of Physics by girls increased, with mean scores for girls 
becoming significantly greater than for boys, and girls matched or exceeded boys in “A level” passes. 
The content of Physics in the VCE curriculum was very similar to that in the earlier curriculum, but 
the new modes of assessment and their distribution throughout the course of study meant not only that 
assessments may suit girls but also that teachers had to adopt new approaches in their teaching 
and pedagogies for its implementation. Unfortunately, when the student workload of this new VCE 
curriculum was assessed after several years of operation, it was considered too heavy for teachers and 
students. CAT 3, the research project, with its opportunity for extended prose favoring girls, was 
dropped, once more tipping the assessment stakes back in favor of boys. 

Concerns with the lack of girls’ participation and performance in science have resulted in reforms 
in other countries. A different approach was taken by Thailand when it reformed its science curricula 
in the mid 1970s. Its solution to the failure of girls to choose Physics was simple: Physics, Chemistry
and Biology were all made mandatory for the Science stream in the senior years and the non-Science 
stream was made comparatively less attractive. The result by the 1980s, at least in the large 
metropolitan region around Bangkok, was that Thailand became the first country to report equal 
gender participation and achievement in Chemistry and Physics [27]. About 10% to 14% of girls and 
boys in this region undertook Physics and achieved equally in these senior years. Although 
communication skills were not emphasized, classroom laboratory and inquiry-based learning were the 
foci of the instruction and assessment, with dexterity and report writing of experiments as components 
of the assessment along with theoretical tests, forms and modes that balanced the assessment 
preferences of girls and boys. The Thai solution of removing choice at these senior levels of schooling 
is clearly not “simple” for a variety of reasons in other countries. Nor do other countries have the 
supply of female Physics and Chemistry teachers that exists in Thailand.

These two case studies demonstrate that curriculum and assessment changes have a clear impact on 
scientific achievement of boys and girls. Again, the decisions to be made are the foci of the Science 
curriculum and the best forms of assessment to identify the knowledge and skills that students have 
achieved. Different forms of assessment not only suit different types of learners, as this section has 
argued on the basis of gendered approaches to assessment, but they also identify different forms of 
knowledge. A strong argument for future Science curricula and their assessment is that all students, 
both boys and girls, should develop the range of scientific knowledge identified through different 
assessment forms. 

4. Language and Linguistics Demands of Science Education and Assessment

In this section we consider equity issues associated with linguistic demands of science education 
and assessment. Scientific curriculum reforms have focused not only on the goal of deeper conceptual 
scientific understanding and application in real contexts, but also on scientific communication as a 
critical component of science knowledge. In Queensland, Australia, every science curriculum has a 
communication or language component [28]. Reforms in scientific curriculum, a focus on extended 
responses, and the general demand of language in education and assessment, affect the way that 
students from diverse backgrounds will be able to demonstrate their knowledge and understanding. 
Assessment for learning research highlights the need for clarity of expectations in assessment to guide 
students’ work. The literacy and linguistic demands of science education are complex and change 
across areas of scientific study [29]. Interestingly, an Australian research study has shown that when 
curriculum and assessment literacy and linguistic requirements are made explicit, science teachers 
provide better direct instruction in the language of the discipline than their peers in 
non-science subjects [29]. 

In the 1970s democratic movements in many more developed countries led to a much wider 
cross-section of a country’s young persons staying at school for secondary education. It became clear 
that the language in which science is expressed and recorded in text books posed a problem for its 
learning for many of this wider student population. Gardner [30] in Australia and Johnstone and 
Cassels [31] in Scotland drew attention to the fact that this was not just a matter of the highly technical 
words of science, but much more about the many everyday words that, in say, Standard English 
(or any other language) take on quite precise and different meanings in the discourse of science.

Gardner et al. [32] examined the special linguistic character of science with a study of junior 
secondary students’ ability to comprehend the meaning of and to use the great variety of logical 
connective words (LCA) used in science texts and science assessment items to link a proposition with 
another idea. They found a significant relationship between students’ LCA knowledge and 
socio-economic status. Words like thus, respectively, in addition, and hence are comprehended 
differently among students of different SES. 

Sutton [33] in England and Munby [34] in Canada drew attention to the centrality of language in 
science classrooms, but it was Lemke [35,36] who drew the attention of linguists such as Halliday and 
Martin [37] to the more linguistic aspects of science that now are a very active area of research 
interest. Inevitably this recognition of the complexity of science language heightens concern that 
equity issues associated with language learning itself will also be significant in students’ science 
learning and in their responses to changes that occur in science assessment. 

An obvious consequence of these language studies was the debate about the validity of written tests 
and their wording to measure scientific understanding. Harlen [38] and Murphy [39] suggested that 
inadequate performance on such tests did not necessarily reflect lack of learning, since the purpose of 
the assessor’s task may be read differently by students, and within different groups of students, again 
opening the possibility of equity issues. Harlow and Jones [40] in New Zealand followed up 
suggestions by Messick [41] and Fensham [42] that interviewing students may illuminate the processes 
that underlie item response and task performance. Harlow and Jones administered 24 science items 
from the TIMSS test to a population of Year eight students and scored these using the TIMSS rubrics. 





A large number of this population, representative of gender and achievement levels, and including a 
small number of two ethnic minority groups—Maori and Asian—were then interviewed about the test 
questions, their written responses to them, and the strategies they used to answer the items. 

The students’ range of scores on the written test was similar to that of the NZ national sample for
TIMSS, but when the interview results were compared with the written responses, the overall scores 
increased in three of the five science content areas. Among the sub-groups, NZ European boys and 
girls, and Maori and Asian girls showed significant increases, that is, many students were able to 
demonstrate more science knowledge through oral responses than on the test forms. Results for Maori 
boys showed little or no difference in the two assessment contexts, perhaps because English was the 
medium of the interview. 

The interviews indicated that there was very often something correct in incorrect written responses. 
For 14 of the test items students had more knowledge than they had written, although for seven items 
students who had “correct” written responses did not have an understanding of the concept being 
assessed. Of more interest are the ways the written test tasks were interpreted by students. 
For example, in a free response item that required a reason to be stated, many students did not give a 
reason in the written test, but half of them gave the reason in the interview. Again, a large number of 
students were found to have misinterpreted the task of one of the written items, confirming Harlen and 
Murphy’s expectation. Similar misinterpretation of written items has been reported by other 
researchers [43-45]. A word, a phrase, or diagram, that is part of the question or a component in a set 
of multiple choice options can be the source of misinterpretation that means students’ knowledge of 
the science may not be elicited. 

Rather than heeding these warnings, written assessment in science remains the dominant form of 
international, national and local assessment of science learning and the amount of writing involved has 
increased substantially as context-based science becomes more usual (encouraged by the example of 
the OECD’s PISA project and its use of more free response items requiring articulated prose). 
A feature of successive PISA Science studies has been the small gender difference in the students’ 
science scores in many countries although the PISA Reading studies universally show that girls have 
much higher achievement than boys on their test items [20]. This finding has enabled critics of PISA, 
such as Sjoberg [46], to argue that gendered reading ability overshadows the PISA Science test’s 
attempt to measure scientific knowledge. Although he misses the point that PISA Science is attempting 
to measure the students’ ability to put their knowledge into practice in real world contexts, and not 
simply their recall of static science knowledge, he does highlight the fact that to measure this student 
ability requires linguistically more complex written items and responses. 

Not surprisingly, the PISA reports consistently show that (a) students born outside the country of 
testing and with parents born outside; (b) students from lower socio-economic backgrounds; and 
(c) minority indigenous students (compounding cultural ways of knowing science, discussed in the 
next section), perform at lower levels than their culturally different and linguistically advantaged 
peers. National written tests for science confirm these findings. For the first of these equity shortfall 
groups the findings appear to be transitory, with second-generation immigrants being less affected. 
The other two equity group differences are more difficult to redress in terms of language literacy and
its flow-on effect in science learning.

For the case of inequity among Australia’s Indigenous populations, McTaggart and Curro [47]
noted that many of these students do not speak or hear Standard English outside the classroom. The 
standard language of the classroom is for them a second or third language, and in the science 
classroom they face even more subtle differences. Davidson [48] has found that such students may 
have the requisite knowledge and skills, but their lack of ability in the Standard English of assessment 
tasks is a barrier to their communication of their knowledge, including their understanding of multiple 
choice items. 

Kaesehagen et al. [49] also found evidence for this literacy constraint, especially when the 
assessment task is phrased in two parts, common in mathematics and science tests, reflecting the 
research on LCA of Gardner et al. and the work of Harlow and Jones above. Thus, at the lower 
levels of schooling where assessment can be more formative in purpose, oral assessment of science 
knowledge may be more equitable than the current written modes. This in itself, however, may require 
teachers to develop new language skills. One solution would be to make greater use of multiple 
classroom-based evidence sources to demonstrate students’ achievement. Digital technologies and 
e-portfolios may provide one way to record the science learning of all students more equitably. 

5. Assessment of Science and Culture 

The previous discussion focused on language in science from a linguistic perspective, and some 
interactions between language background and culture. Language and culture can have a much deeper 
interaction—the formation of science knowledge and experience, and demonstration of scientific 
knowledge in science assessments [50,51]. A core issue in standardized tests and international 
comparison tests is that, while they require test items to have cultural equivalence, this may not be 
establishable. Student achievement must be culturally-contextualized. Science is not “culture free” [50]. 

Much of the current research examining cultural difference compares achievements of groups such 
as Asian students, Asian-American students and other students, on the basis that the first two groups 
outperform other students on tests such as TIMSS and PISA or national or state-based science 
assessments. Simplistic interpretations of science achievement data that attribute differences to race or 
culture are flawed, however. Most differences, as noted, can be identified in terms of opportunity to 
learn, and the social capital of prior experience, the language of science and testing, and resources in 
the home. The higher performance of Asian students is often attributed to membership of a culture that 
values education, but may also be attributed to selectivity of schools and a highly competitive 
environment. By contrast, Asian countries such as China, Hong Kong, Japan and Singapore have been 
looking to the West to change pedagogy and assessment practices to focus more on problem-based 
learning and assessment and use of authentic and situated curriculum and assessment, and less on 
rote-based instruction. Comparisons of science participation and achievement of Black and Hispanic 
students in the US with the performance of Caucasian students are also a focus, as the last group 
significantly outperforms the first two. Again, differences are not due to racial background, but to the 
impact of social disadvantage, lack of resources in home and school, quality of teachers, and 
familiarity with standard language that many students from these cultural groups experience. 

Validity of assessment for students from diverse cultural backgrounds encompasses a much deeper 
issue than disadvantage in socio-economic status. Different groups, especially indigenous groups, have 
different ways of relating to natural phenomena and the environment, and science. Work in Australia 
with students at risk, especially Australian Indigenous students, has initiated place-based science 
education, working in the environment, with communities, and embedding Indigenous ways of 
knowing science [52]. Students’ learning is assessed through performative assessments including 
digital photography, webpages and collaborative effort. Different ways of assessing have been 
designed to reflect different ways of knowing and learning and of interacting with others in 
the community. 

The Australian program reflects the work of Nelson-Barber, Solano-Flores, Estrin and 
Trumbull [53-55] who have explored Indigenous ways of knowing in science for American Indians 
and valid assessment for more than two decades. These researchers highlight that building on the prior 
knowledge of American Indian students is necessary not only to link their experiential understanding 
with the Western science curriculum but also to recognize and value the prior and cultural scientific 
knowledge that these students have. Despite their early work, the researchers note that more recent 
policies such as NCLB entrench cultural bias and marginalize students from different cultures. 
Culturally-based curricula and pedagogy require culturally-valid assessments, with more work needed 
on developing such instruments [54]. “Cultural validity” should be a core assessment concept to ensure 
equity for students in science that avoids an assimilationist perspective [54,55]. Aspects of 
standardized science assessment administrations that may be culturally-inappropriate include language, 
on-demand expectations, and instructions [55]. 

Nelson and Estrin [53] indicated that American Indian ways of knowing should be incorporated in 
science standards (curriculum expectations) and assessed in context, consistent with fundamental 
constructivist approaches to science education. The new Australian Curriculum has implemented a 
cross-curriculum priority of Aboriginal and Torres Strait Islander histories and cultures. As an 
example, a science curriculum component for Year seven students includes “investigating how 
Aboriginal and Torres Strait Islander knowledge is being used to inform scientific decisions, 
for example care of waterways” [56]. How this cross-curriculum priority could be incorporated in or 
could inform forms of assessment, whether the priority creates an assimilationist perspective of culture 
rather than recognizing cultural difference, and how the inclusion of such content will affect the 
science learning of Aboriginal and Torres Strait Islander students, are yet to be examined. Overall, 
empirical research has not compared the nature and depth of the scientific knowledge that students 
from indigenous backgrounds hold and can demonstrate in culturally-appropriate assessments versus 
standardized western culture tests. As Lynch noted, “‘science for all’ is not equal to ‘one size fits all’” 
([57], p. 622). “One-size-fits-all” assessments are not equitable. 

6. Students with Disability and Assessment 

Equity, bias and assessment for students with a disability are relatively recent areas of assessment 
research as in the past many students with a disability, particularly severe physical or intellectual 
impairment, did not attend school or were educated in special institutions. Often the learning 
expectations for students were not high. The move to inclusive schooling, with as many students with a 
disability as possible learning in “mainstream” classrooms with peers without a disability, has seen a 
change in these expectations. A major purpose of NCLB, as noted, was for teachers to have high 
expectations for all students. Accountability for learning outcomes for students with a disability has 
stimulated recent research in appropriate assessment modes and approaches to allow students with a 
disability to demonstrate their knowledge in areas such as science. Consider how traditional 
assessment modes in school could impact on the capacity of Stephen Hawking to demonstrate his 
scientific excellence. 

Disability is defined very broadly in most nations, and by the OECD [58], to include physical, 
emotional and intellectual conditions. Such conditions vary in severity and impact. Disability includes 
dyslexia and autism spectrum disorder as well as visual impairment or hearing impairment. Most 
definitions include learning difficulties that affect the rate of progress of learning but are less easy to 
classify than other disabilities. Systems and teachers need to consider science education for these 
students, how assessment should occur, and most importantly how assessments can be framed to 
enable students to demonstrate their science knowledge in the most enabling way. 

A common approach to assessment of students with a disability is to provide a range of 
“accommodations” to support the student with a disability to use the same assessment form or test used 
by other students. 

Standard forms of accommodation include enlarged print or Braille versions for students with 
vision impairment, amanuenses for students unable to write, assistive technologies, and time 
allowances (more time, breaks). We noted at the beginning of this article that for the layperson, the 
concept of “fair” is all doing the same assessment. Clearly this view would not extend to an 
expectation that a student who is blind would read a written test paper. Immediately, some change to 
the assessment must occur. 

Contentious issues in accommodations for students with a disability are the provision of additional 
time allowances and the reading of assessment information to students. The former raises the question of 
how much time should be allowed, to notionally put the student with a disability on an equitable 
footing in assessment with students without a disability. The issue raised in the reading of assessment 
items is whether reading skill or comprehension of language is the knowledge being assessed, or 
whether the accommodation “fundamentally alter[s] the nature” of the test [59]. Given the previous 
discussion of the role that language and linguistics play in science curriculum and assessment, this is 
clearly an important issue for students with a disability. The concern in the provision of these types of 
assessment accommodations, often expressed, is that students with a disability may gain an unfair 
advantage over students without a disability [10]. The concern never appears to be expressed that 
students with a disability may already have cultural, experiential or linguistic disadvantage. Any equity 
concern that additional time may give an advantage to students with a disability is easily addressed. 
Research demonstrates that all students may gain from additional time or extra adjustments in 
assessment, not only students with a disability but also students from diverse language and cultural 
backgrounds [60,61]. Science assessments should focus on the scientific knowledge and related skills 
being assessed, not on speed of performance. As Sireci, Scarpati and Li [60] noted, if all students do 
better with more time, it is not the time accommodation that is unfair, but that the time conditions 
imposed on all students could be too stringent. 

Equity for students with a disability is still a major issue for the international programs, TIMSS 
and PISA, that assess student science performance. These tests exclude students who are 
“intellectually or physically disabled” and cannot “perform in the ... testing situation” from inclusion 
in the sampling ([62], p. 25). Only small, but varying, percentages of students are excluded from each 
country for these reasons [62], fewer than would be expected on the basis of students with a reported 
[reportable] disability in each country. While a shortened version of the test is available for students 
with a disability and more flexible test-taking conditions are allowed, few countries have opted to use 
this alternative [63]. 

No guidelines are provided with the PISA and TIMSS tests regarding accommodations for students 
with a disability, and the expectation is that countries will need to exclude students who would be very 
difficult or resource intensive to test [64]. The consequence for most countries is that unless improper 
sampling is occurring, students with a disability will be taking the standard test forms. In countries 
such as Australia, with strong policies of inclusion of all students in mainstream schooling, this may be 
a further explanation for the negative tail in the results that are solely identified as relating to 
disadvantage and ethnicity [65]. In recognition that these international science studies do not address 
the worldwide inclusive education agenda and are not fair for students with a disability, 
accommodations for future administrations are being investigated [66]. 

As NCLB provides that students with a disability may be assessed using alternative modes of 
assessment, alternative assessments, and establishing their equivalence to standard test forms, have 
become a focus of considerable recent research in the US. These can include interviews, digital 
recordings, validated teacher observations or checklists, presented as evidence of student learning 
through portfolios [67]. 

Research-based evidence on appropriate assessment modes and conditions for students with a 
disability is scant [10,68], with little recent research looking at the interaction between different forms 
of disability, student knowledge, and achievement through assessment accommodations such as time 
allowances. One promising area of research emerging from the US is adjustments to multiple choice 
items to modify the cognitive demands of the item for students with intellectual disability. Contrary to 
stereotypes that may exist, students with intellectual disability can develop conceptual understanding. 
Their difficulty in learning and assessment may be limited memory, particularly working memory 
capacity [69-71]. The cognitive demands, or load, of a multiple choice item may mask whether a 
student with disability has scientific or mathematical understanding. Cognitive demands also reflect 
the language and linguistics demands of science test items previously discussed for students of 
different language backgrounds. Research by Elliott and colleagues [72,73] has demonstrated that 
items can have a reduced number of distractors and simplified language structures, while maintaining 
reliability and validity in assessing the intended conceptual learning focus. 

7. Conclusions 

In the heady euphoria of the late 1960s Robert Hein, Director of the Elementary Science Studies project 
in the USA, claimed that science should be the easiest subject to teach in primary school because it 
only required observation and talking—powers that the great majority of young learners already could 
bring to school. This naturalist view of science may well still be a good way to start school science, but 
it does not get learners far into the abstract language of science that has been invented to systematize 
and explain these observations. Furthermore, visual observation is only one methodological tool that 
science uses in its ongoing investigation. 
Although the abstractness of science content can mean that learning science is less linked to prior 
social capital than a number of other subjects, its language is to a greater or lesser extent a foreign 
language, and this makes it susceptible to variation in social capital and hence to a lack of equity 
across all students. Furthermore, how this language is expressed in the assessment of science 
learning—all too often written questions and written responses—can affect the equity of its measure of 
understanding science. The numerous examples of inequity that are discussed above concerning the 
style, format and modes of assessment in science call for much greater variety in the presentation of 
assessment tasks and in the manner of response by which students provide evidence of their learning. 

Among the more developed countries there has been concern since about 2000 about negative 
attitudes towards science among students, and existing approaches to the assessment of its learning in 
school have given it the reputation of a “difficult” subject. This is counterproductive at just the time 
when a sound basis in science is needed, given the way its knowledge is increasingly interacting for good or ill with 
the lives of individuals, in their societies and as international citizens. Newer science curricula encourage 
and expect science to be taught and learnt in this interactive sense. Their intentions for learning offer 
the opportunity for students to develop interest in, and to learn science from the situations in their own 
personal experience of these interactions. These opportunities open science to the widest populations 
of students regardless of their associated sources of educational advantage and disadvantage. 

These curricula need to be matched by explicit delineation by curriculum authorities of the 
associated learning expectations, and of the assessment policies and practices that address the various 
sources of inequity to which we have drawn attention. Only then will we begin to move in science 
education from Which Child Left Behind? towards No Child Left Behind. As curriculum authorities 
take such official action, teachers will be encouraged, and indeed required, to use a variety of assessment 
tasks and modes for their classroom formative and summative assessments. These will optimize for all 
students both their ongoing engagement with and their learning of science as powerful and useful 
knowledge for living in the diverse and inter-connected biophysical world we all share. 

This article has demonstrated that assessment in science is vulnerable to capturing social capital of 
students from diverse family backgrounds that relates to the gender of students, their cultural 
experiences, their language, and their capacity to engage with assessment demands. The discussion 
shows that the issues identified in this article have been raised over some past decades, usually without 
resolution. Trends to national and international standardized measures of science, and comparisons of 
performance on these, may be exacerbating rather than addressing these many issues of equity in 
science achievement. 

Most comparative research examining differences in student achievement works from the gold 
standard of performance on assessments that reflect Western science curriculum and the language and 
experiences of students who speak the first language of their country and are from well-resourced 
homes. Implicitly or explicitly, sources of inequity in assessment result in difference being identified 
as a deficit from this norm, not in terms of scientific knowledge and understanding. 

This discussion has demonstrated that difference may reflect bias in the assessment tasks, and in the 
curriculum’s construction. Attempts to remove or control for the cultural context of a student in 
assessment do not result in a culture-free zone [15,74]. 

A different perspective is to take each group as representing its own gold standard. The task for 
research in science assessment is to identify the nature of the science that the students know, its depth, 
its richness, and to identify the construct-irrelevant variance due to the nature of an assessment task. 
Much more empirical research is needed into how students know and understand science, and the 
nature of this science, viewing language, gender and cultural experience constructively. As Luykx and 
colleagues noted, such research requires fine-grained qualitative analysis, necessarily with small 
groups of students, along with collaboration across disciplinary boundaries of “science educators, 
assessment specialists, linguists, anthropologists, discourse analysts, statisticians, and ... 
others” ([50], p. 920). 